Separable Architecture for Fault Isolation and Recovery
Abstract
Fault management is one of the key technologies that enable distributed and disaggregated mission architectures, in which multiple vehicles work cooperatively and autonomously in a cluster or formation, a typical mission concept involving small satellites. In this paper, we describe a software architecture, called Separable Architecture for Fault Isolation and Recovery (SAFIR), which addresses fault management for these types of missions. Although SAFIR is applicable to any system of systems, this paper demonstrates SAFIR for a cluster of spacecraft. The resulting fault detection, isolation, and recovery capability benefits from the SAFIR architecture, which is robust to intermittent communication and highly modular. The SAFIR software has been developed as apps for the Core Flight System (cFS) and has been demonstrated successfully on representative hardware using a high-fidelity simulation of spacecraft in low Earth orbit.
Similar Resources
Fault Isolation and Quick Recovery in Isolation File Systems
Lanyue Lu presented isolation file systems, which provide fault isolation and quick recovery within a single file system. Because file systems are important data access interfaces in many environments, high availability is critical; however, a single fault can have a large-scale impact on the whole file system, such as a remount as read-only or a system crash. Lanyue explained how a metadata ...
ObjectAgent for Robust Autonomous Control
The ObjectAgent system is being developed to create a robust software architecture for autonomous control of complex systems. Agents are used to implement all of the software functionality and communicate through simplified natural language messages. These agents have a set of basic survival skills that monitor for internal software faults, providing low-level fault detection and recovery. High...
Reliability Analysis for Train Control System by Hardware Redundancy Architecture in Fault Tolerance System
The train control system is vital because it controls train speed and interlocking on the railway. As a vital system, it is designed with a double-module or triple-module configuration. Hardware redundancy means using additional hardware to detect and tolerate faults. There are three forms of hardware redundancy: passive, active, and hybrid. Passive redundancy architecture achi...
Performance Evaluation of Recursive Network Architecture for Fault-tolerance
Network fault tolerance is one of the most important capabilities required by mission-critical systems such as the naval Combat System Data Network (CSDN). In this paper, we present performance evaluation results of a fault-tolerant network scheme called Recursive Scalable Autonomous Fault-tolerant Ethernet (RSAFE). The primary goal of the RSAFE scheme is to provide network scalability, and autonomo...
Protected Ethernet Rings for Optical Access Networks
In this paper we propose a centralized link layer architecture for providing low latency fault recovery for optical access rings. This architecture exploits the naturally uneven breakdown of network management responsibilities between the components of an access ring. Important administrative operations like ring status checking, fault detection and recovery are aggregated at the HUB component ...